 scientific ai


Towards a Certificate of Trust: Task-Aware OOD Detection for Scientific AI

Raonić, Bogdan, Mishra, Siddhartha, Lanthaler, Samuel

arXiv.org Artificial Intelligence

Data-driven models are increasingly adopted in critical scientific fields like weather forecasting and fluid dynamics. These methods can fail on out-of-distribution (OOD) data, but detecting such failures in regression tasks is an open challenge. We propose a new OOD detection method based on estimating joint likelihoods using a score-based diffusion model. This approach considers not just the input but also the regression model's prediction, providing a task-aware reliability score. Across numerous scientific datasets, including PDE datasets, satellite imagery and brain tumor segmentation, we show that this likelihood strongly correlates with prediction error. Our work provides a foundational step towards building a verifiable 'certificate of trust', thereby offering a practical tool for assessing the trustworthiness of AI-based scientific predictions. Our code is publicly available at https://github.com/bogdanraonic3/OOD_Detection_ScientificML
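The idea of scoring the input together with the model's prediction can be illustrated with a small sketch. This is not the authors' method: a simple joint Gaussian stands in for the score-based diffusion model, and the regression model (`toy_model`), the target function, and all data below are hypothetical choices made only for illustration.

```python
import numpy as np

def fit_joint_gaussian(inputs, predictions):
    # Stack each input with the model's prediction so the density is task-aware.
    joint = np.hstack([inputs, predictions])
    mean = joint.mean(axis=0)
    # Small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(joint, rowvar=False) + 1e-6 * np.eye(joint.shape[1])
    return mean, cov

def joint_log_likelihood(inputs, predictions, mean, cov):
    # Gaussian log-density of each (input, prediction) pair.
    joint = np.hstack([inputs, predictions])
    diff = joint - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    dim = joint.shape[1]
    return -0.5 * (maha + logdet + dim * np.log(2.0 * np.pi))

def toy_model(x):
    # Hypothetical regression model: cubic Taylor surrogate of sin(x),
    # accurate near zero and increasingly wrong far from it.
    return x - x**3 / 6.0

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(2000, 1))  # training range
x_in = rng.uniform(-1.0, 1.0, size=(500, 1))      # in-distribution test inputs
x_ood = rng.uniform(2.0, 3.0, size=(500, 1))      # shifted (OOD) test inputs

mean, cov = fit_joint_gaussian(x_train, toy_model(x_train))
ll_in = joint_log_likelihood(x_in, toy_model(x_in), mean, cov)
ll_ood = joint_log_likelihood(x_ood, toy_model(x_ood), mean, cov)

# Prediction error against the (hypothetical) ground truth sin(x).
err_in = np.abs(toy_model(x_in) - np.sin(x_in)).ravel()
err_ood = np.abs(toy_model(x_ood) - np.sin(x_ood)).ravel()
```

Comparing `ll_in` with `ll_ood` alongside `err_in` and `err_ood` shows whether a lower joint likelihood goes along with a larger prediction error in this toy setting.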


Data Readiness for Scientific AI at Scale

Brewer, Wesley, Widener, Patrick, Anantharaj, Valentine, Wang, Feiyi, Beck, Tom, Shankar, Arjun, Oral, Sarp

arXiv.org Artificial Intelligence

This paper examines how Data Readiness for AI (DRAI) principles apply to leadership-scale scientific datasets used to train foundation models. We analyze archetypal workflows across four representative domains - climate, nuclear fusion, bio/health, and materials - to identify common preprocessing patterns and domain-specific constraints. We introduce a two-dimensional readiness framework composed of Data Readiness Levels (raw to AI-ready) and Data Processing Stages (ingest to shard), both tailored to high performance computing (HPC) environments. This framework outlines key challenges in transforming scientific data for scalable AI training, emphasizing transformer-based generative models. Together, these dimensions form a conceptual maturity matrix that characterizes scientific data readiness and guides infrastructure development toward standardized, cross-domain support for scalable and reproducible AI for science.


SciAI4Industry -- Solving PDEs for industry-scale problems with deep learning

Witte, Philipp A., Hewett, Russell J., Saurabh, Kumar, Sojoodi, AmirHossein, Chandra, Ranveer

arXiv.org Artificial Intelligence

Solving partial differential equations with deep learning makes it possible to reduce simulation times by multiple orders of magnitude and unlock scientific methods that typically rely on large numbers of sequential simulations, such as optimization and uncertainty quantification. Two of the largest challenges of adopting scientific AI for industrial problem settings are that training datasets must be simulated in advance and that neural networks for solving large-scale PDEs exceed the memory capabilities of current GPUs. We introduce a distributed programming API in the Julia language for simulating training data in parallel on the cloud without requiring users to manage the underlying HPC infrastructure. In addition, we show that model-parallel deep learning based on domain decomposition allows us to scale neural networks for solving PDEs to commercial-scale problem settings and achieve above 90% parallel efficiency. Combining our cloud API for training data generation and model-parallel deep learning, we train large-scale neural networks for solving the 3D Navier-Stokes equation and simulating 3D CO2 flow in porous media. For the CO2 example, we simulate a training dataset based on a commercial carbon capture and storage (CCS) project and train a neural network for CO2 flow simulation on a 3D grid with over 2 million cells that is 5 orders of magnitude faster than a conventional numerical simulator and 3,200 times cheaper.
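For readers unfamiliar with the term, parallel efficiency is commonly defined as the achieved speedup divided by the number of workers. The sketch below uses that strong-scaling definition, which is an assumption since the abstract does not say how efficiency was measured, and the timings are hypothetical rather than taken from the paper.

```python
def parallel_efficiency(serial_time, parallel_time, workers):
    # Strong-scaling efficiency: speedup divided by the number of workers.
    if parallel_time <= 0 or workers <= 0:
        raise ValueError("parallel_time and workers must be positive")
    speedup = serial_time / parallel_time
    return speedup / workers

# Hypothetical timings: 100 s on one GPU, 13.5 s on eight GPUs.
efficiency = parallel_efficiency(100.0, 13.5, 8)
```

An efficiency of 1.0 would mean perfect scaling; values below 1.0 reflect communication and synchronization overhead.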


What is Scientific AI?

#artificialintelligence

While chatbots and self-driving cars are some of the most widely recognised applications of artificial intelligence (AI), machine-learning technologies have also made a decisive mark in the scientific world. Read on to discover more about scientific AI and its implications for everything from genomics to drug development. Technological advances have allowed scientists and researchers to unlock extraordinary amounts of data. Without enough processing power, though, this data is meaningless. This is where artificial intelligence steps up.


Pinaki Laskar on LinkedIn: #AI #algorithms #machinelearning

#artificialintelligence

AI Researcher, Cognitive Technologist Inventor - AI Thinking, Think Chain Innovator - AIOT, XAI, Autonomous Cars, IIOT Founder Fisheyebox Spatial Computing Savant, Transformative Leader, Industry X.0 Practitioner

What is Real #AI for Everything and Everybody Platform? The mainstream human-centric AI has some fundamental problems that need fundamental solutions. First, there is its philosophy, or rather its lack of any philosophy: it relies blindly on statistics, with its processes, algorithms, and inductive inferences, and needs a large volume of big data as the "fuel" to train a model for the special tasks of classification and prediction in very specific cases. Second, it is not a scientific AI that agrees with the rules, principles, and methods of science. Today's AI fails to deal with reality, its causality, and its mentality by strictly following a scientific method of inquiry that depends on the reciprocal interaction of generalizations (hypotheses, laws, theories, and models) and observable or experimental data. Third, there is no common definition of AI, and everyone sees AI in their own way.


Scientific AI in materials science: a path to a sustainable and scalable paradigm - IOPscience

#artificialintelligence

Recent reports, reviews, symposia, and workshops have heralded machine learning (ML) and artificial intelligence (AI) methods as the next scientific paradigm in materials discovery and optimization [1–5]. Applications to materials science have exploded, spanning data analysis, knowledge extraction, and experiment selection [1, 6–9]. The numerous reasons for this trend are related to the omnipresence of ML systems in our everyday lives, the free availability of software, and the demonstrated successes in materials discovery and on-the-fly data acquisition inspired by the Materials Genome Initiative (MGI) [1, 10–12]. However, despite their recent prominence, these techniques have been applied in a variety of materials science fields since the early 1960s [13–17]. Some recent examples of the successful application of ML to materials science were demonstrated by the high-throughput experimental (HTE, also known as 'combinatorial') community. Parallel material synthesis and rapid characterization introduce a critical bottleneck in the analysis of hundreds to thousands of high-quality measurements correlated in composition, processing and microstructure [18–21]. There have been several international efforts to standardize data formats and create data analysis and interpretation tools for large-scale data sets [22–24].